18,781 research outputs found
Robust variable selection for nonlinear models with diverging number of parameters
We focus on the problem of simultaneous variable selection and estimation for nonlinear models based on modal regression (MR), when the number of coefficients diverges with sample size. With appropriate selection of the tuning parameters, the resulting estimator is shown to be consistent and to enjoy the oracle properties.
Robust variable selection in partially varying coefficient single-index model
By combining basis function approximations and smoothly clipped absolute deviation (SCAD) penalty, this paper proposes a robust variable selection procedure for a partially varying coefficient single-index model based on modal regression. The proposed procedure simultaneously selects significant variables in the parametric components and the nonparametric components. With appropriate selection of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. Furthermore, we also discuss the bandwidth selection and propose a modified expectation-maximization (EM)-type algorithm for the proposed estimation procedure. The finite sample properties of the proposed estimators are illustrated by some simulation examples.
The research of Zhu is partially supported by National Natural Science Foundation of China (NNSFC) under Grants 71171075, 71221001 and 71031004. The research of Yu is supported by NNSFC under Grant 11261048.
Exploiting Sentence Embedding for Medical Question Answering
Despite the great success of word embedding, sentence embedding remains a
not-well-solved problem. In this paper, we present a supervised learning
framework to exploit sentence embedding for the medical question answering
task. The learning framework consists of two main parts: 1) a sentence
embedding producing module, and 2) a scoring module. The former is developed
with contextual self-attention and multi-scale techniques to encode a sentence
into an embedding tensor. This module is called Contextual
self-Attention Multi-scale Sentence Embedding (CAMSE) for short. The latter employs two
scoring strategies: Semantic Matching Scoring (SMS) and Semantic Association
Scoring (SAS). SMS measures similarity while SAS captures association between
sentence pairs: a medical question concatenated with a candidate choice, and a
piece of corresponding supportive evidence. The proposed framework is evaluated
on two Medical Question Answering (MedicalQA) datasets collected from
real-world applications: medical exam and clinical diagnosis based on
electronic medical records (EMR). The comparison results show that our proposed
framework achieves significant improvements over competitive baseline
approaches. Additionally, a series of controlled experiments is conducted
to illustrate that the multi-scale strategy and the contextual self-attention
layer play important roles in producing effective sentence embeddings, and that the
two scoring strategies are highly complementary to each other for
question answering problems.
Comment: 8 page